Covid-19 Data

Column

Data

Covid-19 datahub

World-wide Covid-19 data will be downloaded using the COVID19 R package. This package is able to download COVID-19 data across governmental sources at national, regional, and city level, as described in Guidotti and Ardia (2020) (doi:10.21105/joss.02376). It also includes policy measures by ‘Oxford COVID-19 Government Response Tracker’ (https://www.bsg.ox.ac.uk/research/research-projects/coronavirus-governmentresponse-tracker).

For more info on this unified dataset, visit their data hub (https://covid19datahub.io/).

Info

The covid19 datasets contain a lot of information such as:

  • confirmed cases (confirmed)
  • number of deaths (deaths)
  • number of hospitalized patients (hosp)
  • The level at which the numbers were recorded
    • administrative_area_level_1 for data1
    • administrative_area_level_2 for data2
  • population size of administrative_area_level_1 or administrative_area_level_2 (population)
  • numerous restrictions (school_closing, cancel_events, gatherings_restrictions, …)
  • number of recovered cases (recovered)
  • …

Column

Structure

'data.frame':   3645 obs. of  47 variables:
 $ id                                 : chr  "1bb2de77" "1bb2de77" "1bb2de77" "1bb2de77" ...
 $ date                               : Date, format: "2020-03-01" "2020-03-02" ...
 $ confirmed                          : int  6 10 12 17 19 28 33 39 51 64 ...
 $ deaths                             : int  NA NA NA NA NA NA 1 NA NA 2 ...
 $ recovered                          : int  NA NA NA NA NA NA NA NA NA NA ...
 $ tests                              : int  4 21 50 87 121 203 226 248 303 357 ...
 $ vaccines                           : int  NA NA NA NA NA NA NA NA NA NA ...
 $ people_vaccinated                  : int  NA NA NA NA NA NA NA NA NA NA ...
 $ people_fully_vaccinated            : int  NA NA NA NA NA NA NA NA NA NA ...
 $ hosp                               : int  NA NA NA NA NA NA NA NA NA 4 ...
 $ icu                                : int  NA NA NA NA NA NA NA NA NA 0 ...
 $ vent                               : int  NA NA NA NA NA NA NA NA NA 1 ...
 $ school_closing                     : int  0 0 0 0 0 0 0 0 0 0 ...
 $ workplace_closing                  : int  0 0 0 0 0 0 0 0 0 0 ...
 $ cancel_events                      : int  0 0 0 0 0 0 0 0 0 1 ...
 $ gatherings_restrictions            : int  0 0 0 0 0 0 0 0 0 0 ...
 $ transport_closing                  : int  0 0 0 0 0 0 0 0 0 0 ...
 $ stay_home_restrictions             : int  0 0 0 0 0 0 0 0 0 0 ...
 $ internal_movement_restrictions     : int  0 0 0 0 0 0 0 0 0 0 ...
 $ international_movement_restrictions: int  0 0 0 1 1 1 1 1 1 1 ...
 $ information_campaigns              : int  2 2 2 2 2 2 2 2 2 2 ...
 $ testing_policy                     : int  1 1 1 1 1 1 1 1 1 1 ...
 $ contact_tracing                    : int  2 2 2 2 2 2 2 2 2 2 ...
 $ facial_coverings                   : int  0 0 0 0 0 0 0 0 0 0 ...
 $ vaccination_policy                 : int  0 0 0 0 0 0 0 0 0 0 ...
 $ elderly_people_protection          : int  0 0 0 0 0 0 0 0 0 0 ...
 $ government_response_index          : num  -14.6 -14.6 -14.6 -16.1 -16.1 ...
 $ stringency_index                   : num  -11.1 -11.1 -11.1 -13.9 -13.9 ...
 $ containment_health_index           : num  -16.7 -16.7 -16.7 -18.4 -18.4 ...
 $ economic_support_index             : num  0 0 0 0 0 -62.5 -62.5 -62.5 -62.5 -62.5 ...
 $ administrative_area_level          : int  2 2 2 2 2 2 2 2 2 2 ...
 $ administrative_area_level_1        : chr  "Belgium" "Belgium" "Belgium" "Belgium" ...
 $ administrative_area_level_2        : chr  "Bruxelles" "Bruxelles" "Bruxelles" "Bruxelles" ...
 $ administrative_area_level_3        : chr  NA NA NA NA ...
 $ latitude                           : num  50.8 50.8 50.8 50.8 50.8 ...
 $ longitude                          : num  4.37 4.37 4.37 4.37 4.37 ...
 $ population                         : int  1208542 1208542 1208542 1208542 1208542 1208542 1208542 1208542 1208542 1208542 ...
 $ iso_alpha_3                        : chr  "BEL" "BEL" "BEL" "BEL" ...
 $ iso_alpha_2                        : chr  "BE" "BE" "BE" "BE" ...
 $ iso_numeric                        : int  56 56 56 56 56 56 56 56 56 56 ...
 $ iso_currency                       : chr  "EUR" "EUR" "EUR" "EUR" ...
 $ key_local                          : logi  NA NA NA NA NA NA ...
 $ key_google_mobility                : chr  "ChIJ_58PdIbEw0cRMIBML6uZAAE" "ChIJ_58PdIbEw0cRMIBML6uZAAE" "ChIJ_58PdIbEw0cRMIBML6uZAAE" "ChIJ_58PdIbEw0cRMIBML6uZAAE" ...
 $ key_apple_mobility                 : chr  "Brussels" "Brussels" "Brussels" "Brussels" ...
 $ key_jhu_csse                       : chr  NA NA NA NA ...
 $ key_nuts                           : chr  "BE1" "BE1" "BE1" "BE1" ...
 $ key_gadm                           : chr  "BEL.1_1" "BEL.1_1" "BEL.1_1" "BEL.1_1" ...

Exercise 1

Column

Question & Code

Use data1 to visualize the daily confirmed cases in Belgium over time with a colored line. Also make sure all months (with year) appear on the x-axis and give the graph a title.

# Enter your solution here

ggplot(data1,aes(x=date,y=confirmed_daily,col=administrative_area_level_1)) + 
  geom_line() + 
  scale_x_date(date_breaks="1 month", date_labels = format("%B %Y"))+
  theme_bw() + labs(title="Daily confirmed cases in Belgium") +
  theme(axis.text.x = element_text(angle=60,hjust=1))

You will see that the line of confirmed daily cases jumps up and down consistently due to a weekend effect. Add a smooth line through these data points instead. (Tip: a span of 0.2 for loess gives a good result)

# Enter your solution here

ggplot(data1,aes(x=date,y=confirmed_daily,col=administrative_area_level_1)) + 
  geom_line() + 
  scale_x_date(date_breaks="1 month", date_labels = format("%B %Y"))+
  theme_bw() + labs(title="Daily confirmed cases in Belgium") +
  theme(axis.text.x = element_text(angle=60,hjust=1)) + 
  geom_smooth(span=0.2, se=FALSE)

Finally, use your previous code to make 2 plots: one with the line fitting the data directly and one with the smoothed line. Combine these plots into a single plot grid using either the gridExtra or cowplot package.

Plot

Exercise 2

Row

Overview

  • Data: data2
  • Variables of interest: date, hosp, administrative_area_level_2
  • Functions/tips: geom_line, facet_wrap, scale_color_manual

Row

Questions & Code

For this exercise and for Exercises 3 and 4 you will be using data2. Visualize the number of hospitalizations over time in the 3 main regions of Belgium. Make sure the 3 regions are separated into 3 facets and give each line (one per region) a manual color! (pick your favorite colors)

# Enter your solution here

ggplot(data2, aes(x=date, y=hosp, col=administrative_area_level_2)) + 
  geom_line(size=1) + facet_wrap(~administrative_area_level_2) + 
  theme_bw() + labs(y="Number of Hospitalizations") + 
  scale_color_manual(values=c("black","gold","red"))

Change your coloring variable to gatherings_restrictions. Make sure discrete colors are used and not a gradient! Also make sure that you obtain a single line with multiple colors and not multiple lines with a single color. If you are interested in what the different restriction levels mean, check out https://github.com/OxCGRT/covid-policy-tracker/blob/master/documentation/codebook.md#containment-and-closure-policies . (Tip: group=1)

# Enter your solution here

ggplot(data2, aes(x=date, y=hosp, col=factor(gatherings_restrictions))) + 
  geom_line(size=1,aes(group=1)) + facet_wrap(~administrative_area_level_2) + 
  theme_bw() + labs(y="Number of Hospitalizations") +
  theme(axis.text.x = element_text(angle=60,hjust=1))

Row

Plot 1

Plot 2

Exercises plotly

Exercise 3

For this visualization, first recreate a plot similar to the one in Exercise 2 (you should be able to recycle most of your earlier code). This time, however, color by region, do not use facets, and make sure the y-axis shows the number of hospitalizations divided by the regional population size. The latter makes the values more comparable between regions.

  • Data: data2
  • Variables of interest: date, hosp, administrative_area_level_2, population
  • Functions/tips: geom_line, ggplotly

Now transform your ggplot into an interactive plot and add the following additional tooltips: (1) total number of hospitalizations, (2) regional population size, (3) any other variables that are of interest to you (e.g. restrictions) (Tip: group=administrative_area_level_2)

# Enter your solution here

p <- ggplot(data2, aes(x=date, y=hosp/population, col=administrative_area_level_2,
                       text=paste0("Total Hospitalized: ",hosp,"\nPopulation: ",population,"\nSchool Closing: ",school_closing,"\nGathering Restriction: ",gatherings_restrictions))) + 
  geom_line(size=1,aes(group=administrative_area_level_2)) + # Plotly needs to know what each line represents
  theme_bw() + labs(y="Number of Hospitalizations / Regional Population Size",title="Hospitalization in Belgium") +
  scale_x_date(date_breaks="1 month", date_labels = format("%B %Y")) + theme(axis.text.x = element_text(angle=60,hjust=1)) + 
  scale_color_manual(values=c("black","gold","red"))

ggplotly(p)

Plot

Exercises gganimate

Exercise 4

Use the static plot of exercise 3 and animate it however you see fit following one of the approaches in the course slides. You do not need to stick to the line plot.

  • Data: data2
  • Variables of interest: date, hosp, administrative_area_level_2, population
  • Functions/tips: geom_line, geom_point, transition_reveal, transition_time, transition_states
# Enter your solution here

p1 <- ggplot(data2, aes(x=date, y=hosp/population, col=administrative_area_level_2)) + 
  geom_line(size=1) + geom_point(size=2) +
  theme_bw() + labs(y="Number of Hospitalizations / Regional Population Size",title="Hospitalization in Belgium: {frame_along}") +
  scale_color_manual(values=c("black","gold","red")) + 
  scale_x_date(date_breaks="1 month", date_labels = format("%B %Y")) + theme(axis.text.x = element_text(angle=60,hjust=1)) +
  transition_reveal(date)

animate(p1, width=800,height=400)

# Note: We could have used the ggplot statement directly, without saving it in `p1` and then calling `animate()`. However, doing it in 2 steps gives us more control over animation options such as width, height, number of frames, rewind, fps, ...

Plot

Bonus Exercise

Column

Question

From scratch, use data3 to create a plot that shows the deaths per 100.000 people for every week over time for all of the included countries (Belgium, Czech Republic, France, Germany, Netherlands, United Kingdom). Either make this plot interactive by adding tooltips or animate with your favorite animation.

You will need to do some data manipulation to add the daily deaths and the weekly deaths per 100.000. If you get stuck here, don’t be afraid to head over to your best friend google/stackoverflow to find an easy/creative solution.

Column

Code

# Enter your solution here

# Add daily deaths (the same way daily confirmed cases were added above)
data3 <- data3 %>% arrange(date) %>% group_by(administrative_area_level_1) %>% mutate(deaths_daily=c(deaths[1],diff(deaths))) %>% ungroup()

# Add a date_week variable holding the first day of every week. We use the floor_date() function from the lubridate package for this
# (round_date() would instead round each date to the nearest week boundary).
# Then, we sum the daily deaths over date_week. mutate() keeps the daily rows, so confirmed and deaths stay available for the tooltips.
data3 <- data3 %>% mutate(date_week = lubridate::floor_date(date,"week")) %>% 
  group_by(administrative_area_level_1, date_week) %>%
  mutate(deaths_week = sum(deaths_daily)) %>% ungroup()


p <- ggplot(data3, aes(x=date_week, y=deaths_week/population*100000, col=administrative_area_level_1,
                  text=paste0("confirmed: ",confirmed,"\ndeaths: ",deaths))) + 
  geom_line(size=1,aes(group=administrative_area_level_1)) + 
  scale_x_date(date_breaks="1 month", date_labels = format("%B %Y"))+
  theme_bw() + labs(title="Weekly deaths per 100.000", y="Weekly deaths per 100.000",x="") +
  theme(axis.text.x = element_text(angle=60,hjust=1))

ggplotly(p)

Plot

---
title: "Flexdashboard: Interactive & Animated Plotting"
output: 
  flexdashboard::flex_dashboard:
    vertical_layout: fill
    theme: yeti
    source_code: embed
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = FALSE)
```

```{r loadPackages, include=FALSE}
# install.packages("gifski") # Make sure the gifski package is installed as well. It creates a gif out of the frames gganimate produces

library(ggplot2)
library(plotly)
library(tidyr)
library(dplyr)
library(gganimate)
library(gridExtra)
library(cowplot)

library(COVID19) # To download data

```


```{r getData, message=FALSE, warning=FALSE}
# Get COVID19 data from Belgium at national level
data1 <- covid19(country=c("Belgium"),start="2020-03-01", verbose=FALSE)
data1 <- data1[data1$date<=(Sys.Date()-2),] # Remove last 2 days, may still be incomplete
data1 <- data1 %>% arrange(date) %>% group_by(administrative_area_level_1) %>% mutate(confirmed_daily=c(confirmed[1],diff(confirmed))) %>% ungroup() # Add daily confirmed cases: first cumulative value per group, then day-to-day differences


# Get COVID19 data from Belgium at regional level
data2 <- covid19(country=c("Belgium"),start="2020-03-01",level=2, verbose=FALSE)
data2 <- data2[data2$date<=(Sys.Date()-2),] # Remove last 2 days, may still be incomplete
data2 <- data2[data2$administrative_area_level_2!="Ostbelgien",] # We remove the Ostbelgien entries since they do not contain the number of confirmed cases or number of hospitalisations


# Get COVID19 data from multiple countries at national level
data3 <- covid19(country=c("Belgium","CZE","Netherlands","France","Germany","United Kingdom"),start="2020-03-01", verbose=FALSE)
data3 <- data3[data3$date<=(Sys.Date()-2),] # Remove last 2 days, may still be incomplete 


# In case the code above fails to download the data, please load a pre-downloaded version here:
# load(".../data/covid19_belgium.RData")
```


Covid-19 Data 
======================


Column
--------------

### Data {.no-title data-height=500}

<p align="center">
[![Covid-19 datahub](logo.png){width=200}](https://covid19datahub.io/)
</p>

World-wide Covid-19 data will be downloaded using the `COVID19` R package. This package is able to download COVID-19 data across governmental sources at national, regional, and city level, as described in Guidotti and Ardia (2020) (doi:10.21105/joss.02376). It also includes policy measures by 'Oxford COVID-19 Government Response Tracker' (https://www.bsg.ox.ac.uk/research/research-projects/coronavirus-governmentresponse-tracker).

For more info on this unified dataset, visit their data hub (https://covid19datahub.io/).


### Info {data-height=500}

The covid19 datasets contain a lot of information such as:

* confirmed cases (`confirmed`)
* number of deaths (`deaths`)
* number of hospitalized patients (`hosp`)
* The level at which the numbers were recorded
    + `administrative_area_level_1` for `data1`
    + `administrative_area_level_2` for `data2`
* population size of `administrative_area_level_1` or `administrative_area_level_2` (`population`)
* numerous restrictions (`school_closing`, `cancel_events`, `gatherings_restrictions`, ...)
    + more info on the meaning of the restriction levels can be found at https://github.com/OxCGRT/covid-policy-tracker/blob/master/documentation/codebook.md#containment-and-closure-policies
* number of recovered cases (`recovered`)
* ...



Column
--------------------


### Structure {data-height=800}

```{r showData}
str(data2)
```








Exercise 1 {data-navmenu="Exercises ggplot" data-orientation=cols}
================================================================================================


Overview {.sidebar data-width=400}
-------------------------------------


### Overview

* **Data:** `data1`
* **Variables of interest:** `date`, `confirmed_daily`, `administrative_area_level_1`
* **Functions/tips**: `geom_line`, `scale_x_date`,`geom_smooth`




Column {.tabset .tabset-fade}
--------------------------

### Question & Code

Use `data1` to visualize the daily confirmed cases in Belgium over time with a *colored* line. Also make sure all months (with year) appear on the x-axis and give the graph a title.

```{r exercise1_solution1, echo=TRUE,eval=FALSE}
# Enter your solution here

ggplot(data1,aes(x=date,y=confirmed_daily,col=administrative_area_level_1)) + 
  geom_line() + 
  scale_x_date(date_breaks="1 month", date_labels = format("%B %Y"))+
  theme_bw() + labs(title="Daily confirmed cases in Belgium") +
  theme(axis.text.x = element_text(angle=60,hjust=1))

```
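
The `date_labels` argument takes a `strftime()`-style format string, so one quick way to preview a label format is to apply it to a single date with base R's `format()`. A minimal sketch; the example date and the three formats are arbitrary choices, and month names depend on your locale:

```r
# Preview a few strftime-style formats on an arbitrary example date
example_date <- as.Date("2020-03-01")

label_formats <- c(full = "%B %Y", short = "%b %y", numeric = "%m/%Y")
previews <- vapply(label_formats, function(f) format(example_date, f), character(1))
print(previews)
```

Any of these strings can be passed directly as `date_labels` in `scale_x_date()`.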

You will see that the line of confirmed daily cases jumps up and down consistently due to a *weekend effect*. Add a smooth line through these data points instead. (**Tip:** a span of 0.2 for *loess* gives a good result)

```{r exercise1_solution2,eval=FALSE, echo=TRUE}
# Enter your solution here

ggplot(data1,aes(x=date,y=confirmed_daily,col=administrative_area_level_1)) + 
  geom_line() + 
  scale_x_date(date_breaks="1 month", date_labels = format("%B %Y"))+
  theme_bw() + labs(title="Daily confirmed cases in Belgium") +
  theme(axis.text.x = element_text(angle=60,hjust=1)) + 
  geom_smooth(span=0.2, se=FALSE)

```
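
As an alternative to the loess smoother asked for above, a centered 7-day moving average also removes a weekly pattern, because every window contains each weekday exactly once. The sketch below is not part of the exercise solution; it uses made-up toy counts (`toy`, `confirmed_ma7` are hypothetical names) and base R's `stats::filter()`:

```r
# Toy daily counts with a dip every Saturday and Sunday (hypothetical numbers)
toy <- data.frame(date = seq(as.Date("2020-03-02"), by = "day", length.out = 28))
is_weekend <- as.integer(format(toy$date, "%u")) >= 6   # %u: 1 = Monday, ..., 7 = Sunday
toy$confirmed_daily <- 100 + ifelse(is_weekend, -40, 0)

# Centered 7-day moving average; the first and last 3 days have no full window and are NA
toy$confirmed_ma7 <- as.numeric(stats::filter(toy$confirmed_daily, rep(1 / 7, 7), sides = 2))

head(toy, 10)
```

The averaged column can then be plotted with `geom_line(aes(y = confirmed_ma7))` in place of, or on top of, the raw series.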

Finally, use your previous code to make 2 plots: one with the line fitting the data directly and one with the smoothed line. Combine these plots into a single plot grid using either the `gridExtra` or `cowplot` package.
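
For the `cowplot` route, a minimal sketch with `plot_grid()` could look as follows. The toy data frame stands in for `data1` and its values are made up; `p_raw`, `p_smooth` and `grid` are hypothetical names:

```r
library(ggplot2)
library(cowplot)

# Hypothetical toy data standing in for data1
toy <- data.frame(
  date = seq(as.Date("2020-03-01"), by = "day", length.out = 60),
  confirmed_daily = round(100 + 50 * sin(seq_len(60) / 5))
)

p_raw <- ggplot(toy, aes(x = date, y = confirmed_daily)) +
  geom_line() + labs(title = "Raw")
p_smooth <- ggplot(toy, aes(x = date, y = confirmed_daily)) +
  geom_smooth(method = "loess", formula = y ~ x, span = 0.2, se = FALSE) +
  labs(title = "Smoothed")

# Two panels side by side, with automatic "A"/"B" panel labels
grid <- plot_grid(p_raw, p_smooth, ncol = 2, labels = "AUTO")
grid
```

Unlike `grid.arrange()`, `plot_grid()` returns a ggplot object, so the combined figure can be stored and saved with `ggsave()`.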


### Plot

```{r exercise1_solution3, echo=FALSE,fig.width=14, eval=TRUE}
# Enter your solution here

p1 <- ggplot(data1,aes(x=date,y=confirmed_daily,col=administrative_area_level_1)) + 
  geom_line() + 
  scale_x_date(date_breaks="1 month", date_labels = format("%B %Y"))+
  theme_bw() + labs(title="Daily confirmed cases in Belgium") +
  theme(axis.text.x = element_text(angle=60,hjust=1))

p2 <- ggplot(data1,aes(x=date,y=confirmed_daily,col=administrative_area_level_1)) + 
  scale_x_date(date_breaks="1 month", date_labels = format("%B %Y"))+
  theme_bw() + labs(title="Daily confirmed cases in Belgium") +
  theme(axis.text.x = element_text(angle=60,hjust=1)) + 
  geom_smooth(span=0.2, se=FALSE)

# gridExtra solution
grid.arrange(p1,p2,ncol=2)
```







Exercise 2 {data-navmenu="Exercises ggplot" data-orientation=rows}
===============================================

Row {data-height=150}
---------------------

### Overview

* **Data:** `data2`
* **Variables of interest:** `date`, `hosp`, `administrative_area_level_2`
* **Functions/tips**: `geom_line`, `facet_wrap`, `scale_color_manual`


Row
------------


### Questions & Code

For this exercise and for Exercises 3 and 4 you will be using `data2`. Visualize the number of hospitalizations over time in the 3 main regions of Belgium. Make sure the 3 regions are separated into 3 facets and give each line (one per region) a manual color! (pick your favorite colors)


```{r exercise2_solution1, fig.width=10,echo=TRUE,eval=FALSE}
# Enter your solution here

ggplot(data2, aes(x=date, y=hosp, col=administrative_area_level_2)) + 
  geom_line(size=1) + facet_wrap(~administrative_area_level_2) + 
  theme_bw() + labs(y="Number of Hospitalizations") + 
  scale_color_manual(values=c("black","gold","red"))

```
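
An unnamed `values` vector is matched to the regions in the order of the factor levels (alphabetical for character columns). If you prefer to pin a color to a specific region, a named vector does that explicitly. A small sketch with made-up region names and numbers (`toy`, `region_colors`, `p_regions` are hypothetical):

```r
library(ggplot2)

# Hypothetical toy data: three regions, three days each
toy <- data.frame(
  date   = rep(seq(as.Date("2020-03-01"), by = "day", length.out = 3), times = 3),
  region = rep(c("North", "Centre", "South"), each = 3),
  hosp   = c(5, 7, 9, 2, 3, 4, 6, 6, 8)
)

# Names must match the values in the coloring variable
region_colors <- c(North = "gold", Centre = "black", South = "red")

p_regions <- ggplot(toy, aes(x = date, y = hosp, col = region)) +
  geom_line() +
  facet_wrap(~region) +
  scale_color_manual(values = region_colors)
p_regions
```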

Change your coloring variable to `gatherings_restrictions`. Make sure discrete colors are used and not a gradient! Also make sure that you obtain a single line with multiple colors and not multiple lines with a single color.
If you are interested in what the different restriction levels mean, check out https://github.com/OxCGRT/covid-policy-tracker/blob/master/documentation/codebook.md#containment-and-closure-policies .
(**Tip:** `group=1`)

```{r exercise2_solution2, fig.width=10,echo=TRUE,eval=FALSE}
# Enter your solution here

ggplot(data2, aes(x=date, y=hosp, col=factor(gatherings_restrictions))) + 
  geom_line(size=1,aes(group=1)) + facet_wrap(~administrative_area_level_2) + 
  theme_bw() + labs(y="Number of Hospitalizations") +
  theme(axis.text.x = element_text(angle=60,hjust=1))

```
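
Why the `group=1` tip matters: when a discrete variable is mapped to color, ggplot2 by default splits the data into one group (and therefore one line) per level. The sketch below, on made-up toy data, counts the groups in both variants using `layer_data()`; the names `toy`, `p_split` and `p_single` are hypothetical:

```r
library(ggplot2)

# Hypothetical toy data: one series whose restriction level changes over time
toy <- data.frame(
  date  = seq(as.Date("2020-03-01"), by = "day", length.out = 9),
  hosp  = c(1, 3, 6, 10, 12, 11, 8, 5, 3),
  level = rep(c(0, 2, 1), each = 3)
)

# Without an explicit group, ggplot2 draws one line per color level
p_split  <- ggplot(toy, aes(date, hosp, col = factor(level))) + geom_line()
# With group = 1, all points belong to a single line that changes color
p_single <- ggplot(toy, aes(date, hosp, col = factor(level))) + geom_line(aes(group = 1))

# Number of distinct groups (i.e. lines) in each plot
n_split  <- length(unique(layer_data(p_split)$group))
n_single <- length(unique(layer_data(p_single)$group))
c(split = n_split, single = n_single)
```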


Row
----------

### Plot 1 {data-padding=10}

```{r exercise2_solution1A, fig.width=10,echo=FALSE,eval=TRUE}
# Enter your solution here

ggplot(data2, aes(x=date, y=hosp, col=administrative_area_level_2)) + 
  geom_line(size=1) + facet_wrap(~administrative_area_level_2) + 
  theme_bw() + labs(y="Number of Hospitalizations") + 
  scale_color_manual(values=c("black","gold","red"))

```

### Plot 2 {data-padding=10}


```{r exercise2_solution2A, fig.width=10,echo=FALSE,eval=TRUE}
# Enter your solution here

ggplot(data2, aes(x=date, y=hosp, col=factor(gatherings_restrictions))) + 
  geom_line(size=1,aes(group=1)) + facet_wrap(~administrative_area_level_2) + 
  theme_bw() + labs(y="Number of Hospitalizations")

```


Exercises plotly 
=========================


### Exercise 3

For this visualization, first recreate a plot similar to the one in Exercise 2 (you should be able to recycle most of your earlier code). This time, however, color by region, do not use facets, and make sure the y-axis shows the number of hospitalizations divided by the regional population size. The latter makes the values more comparable between regions.

* **Data:** `data2`
* **Variables of interest:** `date`, `hosp`, `administrative_area_level_2`, `population`
* **Functions/tips**: `geom_line`, `ggplotly`


Now transform your ggplot into an interactive plot and add the following additional tooltips: (1) total number of hospitalizations, (2) regional population size, (3) any other variables that are of interest to you (e.g. restrictions)
(**Tip:** `group=administrative_area_level_2`)


```{r exercise3_solution2, fig.width=10, echo=TRUE, eval=FALSE}
# Enter your solution here

p <- ggplot(data2, aes(x=date, y=hosp/population, col=administrative_area_level_2,
                       text=paste0("Total Hospitalized: ",hosp,"\nPopulation: ",population,"\nSchool Closing: ",school_closing,"\nGathering Restriction: ",gatherings_restrictions))) + 
  geom_line(size=1,aes(group=administrative_area_level_2)) + # Plotly needs to know what each line represents
  theme_bw() + labs(y="Number of Hospitalizations / Regional Population Size",title="Hospitalization in Belgium") +
  scale_x_date(date_breaks="1 month", date_labels = format("%B %Y")) + theme(axis.text.x = element_text(angle=60,hjust=1)) + 
  scale_color_manual(values=c("black","gold","red"))

ggplotly(p)
```
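
By default `ggplotly()` shows every mapped aesthetic in the hover box. If only the custom text should appear, the `tooltip` argument can restrict it. A minimal sketch on made-up toy data (region names, numbers and the `hosp_per_100k` column are hypothetical), which also rescales the rate to hospitalizations per 100.000 inhabitants:

```r
library(ggplot2)
library(plotly)

# Hypothetical toy data standing in for data2 (two regions, made-up numbers)
toy <- data.frame(
  date       = rep(seq(as.Date("2020-03-01"), by = "day", length.out = 5), times = 2),
  region     = rep(c("Region A", "Region B"), each = 5),
  hosp       = c(10, 20, 35, 50, 60, 5, 8, 15, 22, 30),
  population = rep(c(1000000, 500000), each = 5)
)
toy$hosp_per_100k <- toy$hosp / toy$population * 100000

p_toy <- ggplot(toy, aes(x = date, y = hosp_per_100k, col = region, group = region,
                         text = paste0("Hospitalized: ", hosp, "\nPopulation: ", population))) +
  geom_line() +
  labs(y = "Hospitalizations per 100.000 inhabitants")

# tooltip = "text" shows only the custom text aesthetic on hover
w <- ggplotly(p_toy, tooltip = "text")
w
```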




### Plot

```{r exercise3_solution2A, fig.width=10, echo=FALSE, eval=TRUE}
# Enter your solution here

p <- ggplot(data2, aes(x=date, y=hosp/population, col=administrative_area_level_2,
                       text=paste0("Total Hospitalized: ",hosp,"\nPopulation: ",population,"\nSchool Closing: ",school_closing,"\nGathering Restriction: ",gatherings_restrictions))) + 
  geom_line(size=1,aes(group=administrative_area_level_2)) + # Plotly needs to know what each line represents
  theme_bw() + labs(y="Number of Hospitalizations / Regional Population Size",title="Hospitalization in Belgium") +
  scale_x_date(date_breaks="1 month", date_labels = format("%B %Y")) + theme(axis.text.x = element_text(angle=60,hjust=1)) + 
  scale_color_manual(values=c("black","gold","red"))

ggplotly(p)
```



Exercises gganimate {data-orientation=rows}
========================



### Exercise 4

Use the static plot of exercise 3 and animate it however you see fit following one of the approaches in the course slides. You do not need to stick to the line plot.

* **Data:** `data2`
* **Variables of interest:** `date`, `hosp`, `administrative_area_level_2`, `population`
* **Functions/tips**: `geom_line`, `geom_point`, `transition_reveal`, `transition_time`, `transition_states`


```{r exercise4_solution1, fig.width=10, warning=FALSE, echo=TRUE, eval=FALSE}
# Enter your solution here

p1 <- ggplot(data2, aes(x=date, y=hosp/population, col=administrative_area_level_2)) + 
  geom_line(size=1) + geom_point(size=2) +
  theme_bw() + labs(y="Number of Hospitalizations / Regional Population Size",title="Hospitalization in Belgium: {frame_along}") +
  scale_color_manual(values=c("black","gold","red")) + 
  scale_x_date(date_breaks="1 month", date_labels = format("%B %Y")) + theme(axis.text.x = element_text(angle=60,hjust=1)) +
  transition_reveal(date)

animate(p1, width=800,height=400)

# Note: We could have used the ggplot statement directly, without saving it in `p1` and then calling `animate()`. However, doing it in 2 steps gives us more control over animation options such as width, height, number of frames, rewind, fps, ...

```
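
To illustrate the animation options mentioned in the note above, here is a small sketch on made-up toy data. The values for `nframes` and `fps` and the output file name `toy_animation.gif` are arbitrary assumptions; rendering is guarded because it is the slow step and needs the `gifski` package:

```r
library(ggplot2)
library(gganimate)

# Hypothetical toy data: one value per day
toy <- data.frame(
  date = seq(as.Date("2020-03-01"), by = "day", length.out = 30),
  hosp = round(50 + 30 * sin(seq_len(30) / 4))
)

anim <- ggplot(toy, aes(x = date, y = hosp)) +
  geom_line() + geom_point(size = 2) +
  labs(title = "Date: {frame_along}") +
  transition_reveal(date)

# Render only in an interactive session with gifski available; drop the guard to always render
if (interactive() && requireNamespace("gifski", quietly = TRUE)) {
  gif <- animate(anim, nframes = 60, fps = 10, width = 800, height = 400,
                 rewind = FALSE, renderer = gifski_renderer())
  anim_save("toy_animation.gif", animation = gif)  # hypothetical file name
}
```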


### Plot



```{r exercise4_solution1A, fig.width=10, warning=FALSE, echo=FALSE, eval=TRUE}
# Enter your solution here

p1 <- ggplot(data2, aes(x=date, y=hosp/population, col=administrative_area_level_2)) + 
  geom_line(size=1) + geom_point(size=2) +
  theme_bw() + labs(y="Number of Hospitalizations / Regional Population Size",title="Hospitalization in Belgium: {frame_along}") +
  scale_color_manual(values=c("black","gold","red")) + 
  scale_x_date(date_breaks="1 month", date_labels = format("%B %Y")) + theme(axis.text.x = element_text(angle=60,hjust=1)) +
  transition_reveal(date)

animate(p1, width=800,height=400)

# Note: We could have used the ggplot statement directly, without saving it in `p1` and then calling `animate()`. However, doing it in 2 steps gives us more control over animation options such as width, height, number of frames, rewind, fps, ...

```



Bonus Exercise {data-orientation=cols}
=================


Column {data-width=200}
------------


### Question 

From scratch, use `data3` to create a plot that shows the **deaths per 100.000 people for every week** over time for all of the included countries (Belgium, Czech Republic, France, Germany, Netherlands, United Kingdom). Either make this plot interactive by adding tooltips or animate with your favorite animation.  

You will need to do some data manipulation to add the daily deaths and the weekly deaths per 100.000. If you get stuck here, don't be afraid to head over to your best friend google/stackoverflow to find an easy/creative solution.
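
The data manipulation can be tried out on a tiny example first. The sketch below uses an invented country ("Toyland") with made-up cumulative deaths and goes from cumulative to daily to weekly counts per 100.000. It is one possible approach, not the only one: it assumes weeks starting on Monday (`week_start = 1`) and collapses to one row per week with `summarise()`:

```r
library(dplyr)
library(lubridate)

# Hypothetical toy data: cumulative deaths for one country over two weeks
toy <- data.frame(
  country    = "Toyland",
  date       = seq(as.Date("2020-03-02"), by = "day", length.out = 14),  # starts on a Monday
  deaths     = cumsum(rep(c(1, 2), each = 7)),
  population = 1000000
)

weekly <- toy %>%
  arrange(country, date) %>%
  group_by(country) %>%
  mutate(deaths_daily = c(deaths[1], diff(deaths))) %>%             # cumulative -> daily
  mutate(date_week = floor_date(date, "week", week_start = 1)) %>%  # Monday of each week
  group_by(country, date_week) %>%
  summarise(deaths_week = sum(deaths_daily),
            population  = first(population),
            .groups = "drop") %>%
  mutate(deaths_week_per_100k = deaths_week / population * 100000)

weekly
```

Here the first week sums to 7 deaths and the second to 14, i.e. 0.7 and 1.4 per 100.000 inhabitants.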


Column 
---------------------


### Code


```{r exercisebonus_solution, message=FALSE, warning=FALSE, fig.width=10,echo=TRUE,eval=FALSE}
# Enter your solution here

# Add daily deaths (the same way daily confirmed cases were added above)
data3 <- data3 %>% arrange(date) %>% group_by(administrative_area_level_1) %>% mutate(deaths_daily=c(deaths[1],diff(deaths))) %>% ungroup()

# Add a date_week variable holding the first day of every week. We use the floor_date() function from the lubridate package for this
# (round_date() would instead round each date to the nearest week boundary).
# Then, we sum the daily deaths over date_week. mutate() keeps the daily rows, so confirmed and deaths stay available for the tooltips.
data3 <- data3 %>% mutate(date_week = lubridate::floor_date(date,"week")) %>% 
  group_by(administrative_area_level_1, date_week) %>%
  mutate(deaths_week = sum(deaths_daily)) %>% ungroup()


p <- ggplot(data3, aes(x=date_week, y=deaths_week/population*100000, col=administrative_area_level_1,
                  text=paste0("confirmed: ",confirmed,"\ndeaths: ",deaths))) + 
  geom_line(size=1,aes(group=administrative_area_level_1)) + 
  scale_x_date(date_breaks="1 month", date_labels = format("%B %Y"))+
  theme_bw() + labs(title="Weekly deaths per 100.000", y="Weekly deaths per 100.000",x="") +
  theme(axis.text.x = element_text(angle=60,hjust=1))

ggplotly(p)

```


### Plot

```{r exercisebonus_solutionA, message=FALSE, warning=FALSE, fig.width=10}
# Enter your solution here

# Add daily deaths (the same way daily confirmed cases were added above)
data3 <- data3 %>% arrange(date) %>% group_by(administrative_area_level_1) %>% mutate(deaths_daily=c(deaths[1],diff(deaths))) %>% ungroup()

# Add a date_week variable holding the first day of every week. We use the floor_date() function from the lubridate package for this
# (round_date() would instead round each date to the nearest week boundary).
# Then, we sum the daily deaths over date_week. mutate() keeps the daily rows, so confirmed and deaths stay available for the tooltips.
data3 <- data3 %>% mutate(date_week = lubridate::floor_date(date,"week")) %>% 
  group_by(administrative_area_level_1, date_week) %>%
  mutate(deaths_week = sum(deaths_daily)) %>% ungroup()


p <- ggplot(data3, aes(x=date_week, y=deaths_week/population*100000, col=administrative_area_level_1,
                  text=paste0("confirmed: ",confirmed,"\ndeaths: ",deaths))) + 
  geom_line(size=1,aes(group=administrative_area_level_1)) + 
  scale_x_date(date_breaks="1 month", date_labels = format("%B %Y"))+
  theme_bw() + labs(title="Weekly deaths per 100.000", y="Weekly deaths per 100.000",x="") +
  theme(axis.text.x = element_text(angle=60,hjust=1))

ggplotly(p)

```